DeKalb
Proposing a Framework for Machine Learning Adoption on Legacy Systems
Rahman, Ashiqur, Alhoori, Hamed
The integration of machine learning (ML) is critical for industrial competitiveness, yet its adoption is frequently stalled by the prohibitive costs and operational disruptions of upgrading legacy systems. The financial and logistical overhead required to support the full ML lifecycle presents a formidable barrier to widespread implementation, particularly for small and medium-sized enterprises. This paper introduces a pragmatic, API-based framework designed to overcome these challenges by strategically decoupling the ML model lifecycle from the production environment. Our solution delivers the analytical power of ML to domain experts through a lightweight, browser-based interface, eliminating the need for local hardware upgrades and ensuring model maintenance can occur with zero production downtime. This human-in-the-loop approach empowers experts with interactive control over model parameters, fostering trust and facilitating seamless integration into existing workflows. By mitigating the primary financial and operational risks, this framework offers a scalable and accessible pathway to enhance production quality and safety, thereby strengthening the competitive advantage of the manufacturing sector.
- North America > United States > Illinois > DeKalb County > DeKalb (0.04)
- Europe > Switzerland > Basel-City > Basel (0.04)
- Asia > Singapore > Central Region > Singapore (0.04)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (0.94)
- Banking & Finance (0.68)
- Consumer Products & Services > Food, Beverage, Tobacco & Cannabis (0.68)
BAGELS: Benchmarking the Automated Generation and Extraction of Limitations from Scholarly Text
Azher, Ibrahim Al, Mokarrama, Miftahul Jannat, Guo, Zhishuai, Choudhury, Sagnik Ray, Alhoori, Hamed
In scientific research, "limitations" refer to the shortcomings, constraints, or weaknesses of a study. Transparent reporting of such limitations can enhance the quality and reproducibility of research and improve public trust in science. However, authors often underreport limitations in their papers and rely on hedging strategies to meet editorial requirements at the expense of readers' clarity and confidence. This tendency, combined with the surge in scientific publications, has created a pressing need for automated approaches to extract and generate limitations from scholarly papers. To address this need, we present a full architecture for computational analysis of research limitations. Specifically, we (1) create a dataset of limitations from ACL, NeurIPS, and PeerJ papers by extracting them from the text and supplementing them with external reviews; (2) propose methods to automatically generate limitations using a novel Retrieval Augmented Generation (RAG) technique; and (3) design a fine-grained evaluation framework for generated limitations, along with a meta-evaluation of these techniques.
- North America > United States > Texas > Denton County > Denton (0.14)
- North America > United States > Illinois > DeKalb County > DeKalb (0.04)
- North America > Canada (0.04)
- Health & Medicine (1.00)
- Education > Educational Setting (0.68)
FutureGen: LLM-RAG Approach to Generate the Future Work of Scientific Article
Azher, Ibrahim Al, Mokarrama, Miftahul Jannat, Guo, Zhishuai, Choudhury, Sagnik Ray, Alhoori, Hamed
The future work section of a scientific article outlines potential research directions by identifying gaps and limitations of a current study. This section serves as a valuable resource for early-career researchers seeking unexplored areas and experienced researchers looking for new projects or collaborations. In this study, we generate future work suggestions from key sections of a scientific article alongside related papers and analyze how the trends have evolved. We experimented with various Large Language Models (LLMs) and integrated Retrieval-Augmented Generation (RAG) to enhance the generation process. We incorporate an LLM feedback mechanism to improve the quality of the generated content and propose an LLM-as-a-judge approach for evaluation. Our results demonstrated that the RAG-based approach with LLM feedback outperforms other methods evaluated through qualitative and quantitative metrics. Moreover, we conduct a human evaluation to assess the LLM as an extractor and judge. The code and dataset for this project are here, code: HuggingFace
- North America > United States > Texas > Denton County > Denton (0.14)
- North America > United States > Illinois > DeKalb County > DeKalb (0.04)
- Europe > Greece (0.04)
- Education (0.67)
- Government (0.46)
Effective Defect Detection Using Instance Segmentation for NDI
Rahman, Ashiqur, Seethi, Venkata Devesh Reddy, Yunker, Austin, Kral, Zachary, Kettimuthu, Rajkumar, Alhoori, Hamed
Ultrasonic testing is a common Non-Destructive Inspection (NDI) method used in aerospace manufacturing. However, the complexity and size of the ultrasonic scans make it challenging to identify defects through visual inspection or machine learning models. Using computer vision techniques to identify defects from ultrasonic scans is an evolving research area. In this study, we used instance segmentation to identify the presence of defects in the ultrasonic scan images of composite panels that are representative of real components manufactured in aerospace. We used two models based on Mask-RCNN (Detectron 2) and YOLO 11 respectively. Additionally, we implemented a simple statistical pre-processing technique that reduces the burden of requiring custom-tailored pre-processing techniques. Our study demonstrates the feasibility and effectiveness of using instance segmentation in the NDI pipeline by significantly reducing data pre-processing time, inspection time, and overall costs.
- North America > United States > Kansas > Sedgwick County > Wichita (0.04)
- North America > United States > Illinois > DeKalb County > DeKalb (0.04)
- North America > United States > Illinois > Cook County > Lemont (0.04)
- Europe > Switzerland > Basel-City > Basel (0.04)
Public interest in science or bots? Selective amplification of scientific articles on Twitter
Rahman, Ashiqur, Mohammadi, Ehsan, Alhoori, Hamed
With the remarkable capability to reach the public instantly, social media has become integral in sharing scholarly articles to measure public response. Since spamming by bots on social media can steer the conversation and present a false public interest in given research, affecting policies impacting the public's lives in the real world, this topic warrants critical study and attention. We used the Altmetric dataset in combination with data collected through the Twitter Application Programming Interface (API) and the Botometer API. We combined the data into an extensive dataset with academic articles, several features from the article and a label indicating whether the article had excessive bot activity on Twitter or not. We analyzed the data to see the possibility of bot activity based on different characteristics of the article. We also trained machine-learning models using this dataset to identify possible bot activity in any given article. Our machine-learning models were capable of identifying possible bot activity in any academic article with an accuracy of 0.70. We also found that articles related to "Health and Human Science" are more prone to bot activity compared to other research areas. Without arguing the maliciousness of the bot activity, our work presents a tool to identify the presence of bot activity in the dissemination of an academic article and creates a baseline for future research in this direction.
- North America > United States > South Carolina > Richland County > Columbia (0.14)
- North America > United States > Illinois > DeKalb County > DeKalb (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Services (1.00)
- Information Technology > Security & Privacy (0.70)
- Media (0.67)
- Health & Medicine > Therapeutic Area > Immunology (0.47)
Harnessing AI data-driven global weather models for climate attribution: An analysis of the 2017 Oroville Dam extreme atmospheric river
Baño-Medina, Jorge, Sengupta, Agniv, Michaelis, Allison, Monache, Luca Delle, Kalansky, Julie, Watson-Parris, Duncan
AI data-driven models (Graphcast, Pangu Weather, Fourcastnet, and SFNO) are explored for storyline-based climate attribution due to their short inference times, which can increase the number of events studied and provide real-time attributions when public attention is heightened. The analysis is framed on the extreme atmospheric river episode of February 2017 that contributed to the Oroville Dam spillway incident in Northern California. Past and future simulations are generated by perturbing the initial conditions with the pre-industrial and the late-21st century temperature climate change signals, respectively. The simulations are compared to results from a dynamical model which represents plausible pseudo-realities under both climate environments. Overall, the AI models show promising results, projecting a 5-6% increase in the integrated water vapor over the Oroville Dam in the present day compared to the pre-industrial, in agreement with the dynamical model. Different geopotential-moisture-temperature dependencies are unveiled for each of the AI models tested, providing valuable information for understanding the physicality of the attribution response. However, the AI models tend to simulate weaker attribution values than the pseudo-reality imagined by the dynamical model, suggesting some reduced extrapolation skill, especially for the late-21st century regime. Large ensembles generated with an AI model (>500 members) produced statistically significant present-day to pre-industrial attribution results, unlike the >20-member ensemble from the dynamical model. This analysis highlights the potential of AI models to conduct attribution analysis, while emphasizing future lines of work on explainable artificial intelligence to gain confidence in these tools, which can enable reliable attribution studies in real time.
- Africa (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > California > San Diego County > San Diego (0.04)
- Energy > Renewable > Hydroelectric (1.00)
- Energy > Power Industry (1.00)
- Government > Regional Government > North America Government > United States Government (0.46)
Learning causation event conjunction sequences
This is an examination of some methods that learn causations in event sequences. A causation is defined as a conjunction of one or more cause events occurring in an arbitrary order, with possible intervening non-causal events, that lead to an effect. The methods include recurrent and non-recurrent artificial neural networks (ANNs), as well as a histogram-based algorithm. An attention recurrent ANN performed the best of the ANNs, while the histogram algorithm was significantly superior to all the ANNs.
- North America > United States > Illinois > DeKalb County > DeKalb (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
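The causation definition in the abstract above ("Learning causation event conjunction sequences") can be illustrated with a small check. This is only a sketch of the definition, not the histogram algorithm or any of the ANN methods from the paper. The function name `effect_is_caused` and the single-letter event encoding are hypothetical, and it assumes the effect counts as caused as soon as every cause event has appeared somewhere before it.

```python
def effect_is_caused(sequence, causes, effect):
    """Return True if `effect` occurs after every event in `causes` has
    appeared, in any order, with other events allowed in between."""
    pending = set(causes)  # cause events not yet observed
    for event in sequence:
        if event in pending:
            pending.discard(event)
        elif event == effect and not pending:
            return True
    return False


# Hypothetical example: causes "a" and "b", effect "E", noise events "x", "y", "z".
causes = {"a", "b"}
in_order = effect_is_caused(list("xaybzE"), causes, "E")
reversed_order = effect_is_caused(list("bxaE"), causes, "E")
missing_cause = effect_is_caused(list("axE"), causes, "E")
effect_too_early = effect_is_caused(list("Eab"), causes, "E")
```

Under these assumptions the first two sequences satisfy the definition, because the order of the causes does not matter, while the last two do not.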
LightningNet: Distributed Graph-based Cellular Network Performance Forecasting for the Edge
Zacharopoulos, Konstantinos, Koutroumpas, Georgios, Arapakis, Ioannis, Georgopoulos, Konstantinos, Khangosstar, Javad, Ioannidis, Sotiris
The cellular network plays a pivotal role in providing Internet access, since it is the only global-scale infrastructure with ubiquitous mobility support. To manage and maintain large-scale networks, mobile network operators require timely information, or even accurate performance forecasts. In this paper, we propose LightningNet, a lightweight and distributed graph-based framework for forecasting cellular network performance, which can capture spatio-temporal dependencies that arise in the network traffic. LightningNet achieves a steady performance increase over state-of-the-art forecasting techniques, while maintaining a similar resource usage profile. Our architecture is also specifically designed to support IoT and edge devices, which gives it a further advantage over the current state of the art, as indicated by our performance experiments with NVIDIA Jetson.
- Europe > United Kingdom (0.14)
- Asia > China (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Telecommunications (1.00)
- Information Technology > Networks (1.00)
- Information Technology > Communications > Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.67)
Analyzing Brain Activity During Learning Tasks with EEG and Machine Learning
Cho, Ryan, Zaman, Mobasshira, Cho, Kyu Taek, Hwang, Jaejin
This study aimed to analyze brain activity during various STEM activities, exploring the feasibility of classifying between different tasks. EEG brain data from twenty subjects engaged in five cognitive tasks were collected and segmented into 4-second clips. Power spectral densities of brain frequency waves were then analyzed. Testing different k-intervals with XGBoost, Random Forest, and Bagging Classifier revealed that Random Forest performed best, achieving a testing accuracy of 91.07% at an interval size of two. When utilizing all four EEG channels, cognitive flexibility was most recognizable. Task-specific classification accuracy showed the right frontal lobe excelled in mathematical processing and planning, the left frontal lobe in cognitive flexibility and mental flexibility, and the left temporoparietal lobe in connections. Notably, numerous connections between frontal and temporoparietal lobes were observed during STEM activities. This study contributes to a deeper understanding of implementing machine learning in analyzing brain activity and sheds light on the brain's mechanisms.
- North America > United States > Wisconsin (0.05)
- North America > United States > Illinois > DeKalb County > DeKalb (0.04)
- Asia > India (0.04)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Education > Curriculum > Subject-Specific Education (1.00)
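The preprocessing described in the EEG abstract above (4-second clips followed by power spectral analysis) might look roughly like the following. This is a minimal standard-library sketch, not the authors' pipeline: the sampling rate of 64 Hz, the synthetic 10 Hz test signal, the band edges, and the function names `segment`, `periodogram`, and `band_power` are all assumptions made for illustration, and the power values are correct only up to a constant scaling.

```python
import cmath
import math


def segment(signal, fs, clip_seconds=4.0):
    """Split a 1-D signal into non-overlapping clips, dropping any incomplete tail."""
    n = int(fs * clip_seconds)
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]


def periodogram(clip, fs):
    """Naive DFT periodogram for frequencies 0..fs/2 (power up to a constant factor)."""
    n = len(clip)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(clip))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def band_power(freqs, power, lo, hi):
    """Sum the periodogram power over the band lo <= f < hi (in Hz)."""
    return sum(p for f, p in zip(freqs, power) if lo <= f < hi)


# Assumed setup: 9 seconds of a synthetic 10 Hz oscillation sampled at 64 Hz.
fs = 64
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(9 * fs)]
clips = segment(signal, fs)            # two full 4-second clips, tail dropped
freqs, power = periodogram(clips[0], fs)
alpha = band_power(freqs, power, 8, 13)   # band containing the 10 Hz component
theta = band_power(freqs, power, 4, 8)
```

Per-band powers such as `alpha` and `theta` are the kind of features that could then be passed to a classifier such as Random Forest. A real pipeline would more likely use an FFT-based estimator than this quadratic-time DFT.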
Machine Learning Techniques to Identify Hand Gestures amidst Forearm Muscle Signals
Cho, Ryan, Patel, Sunil, Cho, Kyu Taek, Hwang, Jaejin
This study investigated the use of forearm EMG data for distinguishing eight hand gestures, employing the Neural Network and Random Forest algorithms on data from ten participants. The Neural Network achieved 97 percent accuracy with 1000-millisecond windows, while the Random Forest achieved 85 percent accuracy with 200-millisecond windows. Larger window sizes improved gesture classification due to increased temporal resolution. The Random Forest exhibited faster processing at 92 milliseconds, compared to the Neural Network's 124 milliseconds. In conclusion, the study identified a Neural Network with a 1000-millisecond stream as the most accurate (97 percent), and a Random Forest with a 200-millisecond stream as the most efficient (85 percent). Future research should focus on increasing sample size, incorporating more hand gestures, and exploring different feature extraction methods and modeling algorithms to enhance system accuracy and efficiency.
- North America > United States > Illinois > DeKalb County > DeKalb (0.04)
- North America > United States > Texas (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Health & Medicine > Consumer Health (0.68)
- Health & Medicine > Therapeutic Area > Neurology (0.47)
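The window sizes compared in the EMG abstract above (200 ms versus 1000 ms) can be made concrete with a short windowing sketch. The abstract does not state the sampling rate, overlap, or features used, so the 1000 Hz rate, the non-overlapping default, the RMS feature, and the names `sliding_windows` and `rms` are assumptions for illustration only.

```python
import math


def sliding_windows(samples, fs_hz, window_ms, step_ms=None):
    """Cut a signal into fixed-length windows; non-overlapping unless step_ms is given."""
    size = int(fs_hz * window_ms / 1000)
    step = size if step_ms is None else int(fs_hz * step_ms / 1000)
    if size <= 0 or step <= 0:
        raise ValueError("window and step must cover at least one sample")
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, step)]


def rms(window):
    """Root-mean-square amplitude, a common (assumed here) EMG feature."""
    return math.sqrt(sum(x * x for x in window) / len(window))


# Assumed setup: 2 seconds of one synthetic channel sampled at 1000 Hz.
fs_hz = 1000
samples = [math.sin(2 * math.pi * 50 * t / fs_hz) for t in range(2 * fs_hz)]

short_windows = sliding_windows(samples, fs_hz, window_ms=200)
long_windows = sliding_windows(samples, fs_hz, window_ms=1000)
short_features = [rms(w) for w in short_windows]
long_features = [rms(w) for w in long_windows]
```

With these assumptions, the same recording yields more short windows than long ones, so a 200 ms classifier can emit predictions more often, while each 1000 ms window summarizes more signal per prediction.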